Utilizing lexical data from a web-derived corpus to expand productive collocation knowledge

Authors

  • Shaoqun Wu
  • Ian H. Witten
  • Margaret Franken
Abstract

Collocations are of great importance for second language learners, and a learner’s knowledge of them plays a key role in producing language fluently (Nation, 2001: 323). In this article we describe and evaluate an innovative system that uses a Web-derived corpus and digital library software to produce a vast concordance and present it in a way that helps students use collocations more effectively in their writing. Instead of live search we use an off-line corpus of short sequences of words, along with their frequencies. They are preprocessed, filtered, and organized into a searchable digital library collection containing 380 million five-word sequences drawn from a vocabulary of 145,000 words. Although the phrases are short, learners can browse more extended contexts because the system automatically locates sample sentences that contain them, either on the Web or in the British National Corpus. Two evaluations were conducted: an expert user tested the system to see if it could generate suitable alternatives for given text fragments, and students used it for a particular exercise. Both suggest that, even within the constraints of a limited study, the system could and did help students improve their writing.
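The preprocessing described in the abstract (filtering frequency-annotated five-word sequences against a vocabulary, then organizing them for search) can be illustrated with a minimal sketch. It is not the authors' implementation: the tab-separated input format, the function names, the `min_freq` threshold, and the sample data are all assumptions made for illustration. Summing frequencies across overlapping sequences is a simplification that would overcount in a real corpus.

```python
from collections import Counter, defaultdict


def parse_ngrams(lines):
    """Yield (tokens, frequency) from lines shaped 'w1 w2 w3 w4 w5<TAB>count' (assumed format)."""
    for line in lines:
        text, _, count = line.rstrip("\n").partition("\t")
        tokens = text.split()
        if len(tokens) == 5 and count.isdigit():
            yield tuple(tokens), int(count)


def build_index(ngrams, vocabulary, min_freq=1):
    """Keep sequences whose words are all in the vocabulary and index them by word."""
    index = defaultdict(list)
    for tokens, freq in ngrams:
        if freq < min_freq:
            continue  # drop rare sequences (hypothetical threshold)
        lowered = tuple(t.lower() for t in tokens)
        if not all(t in vocabulary for t in lowered):
            continue  # drop sequences containing out-of-vocabulary words
        for word in set(lowered):
            index[word].append((lowered, freq))
    return index


def right_neighbours(index, word, top=3):
    """Rank the words that immediately follow `word`, weighted by sequence frequency."""
    counts = Counter()
    for tokens, freq in index.get(word, []):
        for left, right in zip(tokens, tokens[1:]):
            if left == word:
                counts[right] += freq
    return counts.most_common(top)


# Invented sample data, for illustration only.
SAMPLE = [
    "we make a decision today\t120",
    "to make a decision about\t300",
    "they make progress every day\t80",
    "we do a decision today\t2",
    "make xqzt a decision now\t50",
]
VOCAB = {"we", "make", "a", "decision", "today", "to", "about",
         "they", "progress", "every", "day", "do"}

index = build_index(parse_ngrams(SAMPLE), VOCAB, min_freq=5)
print(right_neighbours(index, "make"))  # [('a', 420), ('progress', 80)]
```

Indexing every sequence under each of its words trades memory for lookup speed, so a learner's query word retrieves all the sequences it appears in without scanning the whole collection.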

Similar resources

Acquiring Collocations For Lexical Choice Between Near-Synonyms

We extend a lexical knowledge-base of near-synonym differences with knowledge about their collocational behaviour. This type of knowledge is useful in the process of lexical choice between near-synonyms. We acquire collocations for the near-synonyms of interest from a corpus (only collocations with the appropriate sense and part-of-speech). For each word that collocates with a near-synonym we us...

The Impact of Teaching Corpus-based Collocation on EFL Learners' Writing Ability

Abstract The present study explores the impact of corpus-based collocation instruction on intermediate Iranian EFL learners' writing ability. For this study, 84 Iranian learners, studying English as a foreign language in Bayan Institute, Iran, were selected and were randomly divided into two groups, experimental and control. Conventional methods of writing instruction were taught to the control...

Supporting collocation learning with a digital library

Extensive knowledge of collocations is a key factor that distinguishes learners from fluent native speakers. Such knowledge is difficult to acquire simply because there is so much of it. This paper describes a system that exploits the facilities offered by digital libraries to provide a rich collocation-learning environment. The design is based on three processes that have been identified as le...

Distributional Lexical Semantics for Stop Lists

In this paper, we consider the use of techniques that lead naturally towards using distributional lexical semantics for the automatic construction of corpora-specific stop word lists. We propose and evaluate a method for calculating stop words based on collocation, frequency information and comparisons of distributions within and across samples. This method is tested against the Enron email cor...

Journal:
  • ReCALL

Volume 22

Publication date: 2010